Automatic categorization for improving Spanish into Spanish Sign Language machine translation
Abstract
This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española: LSE) translation system when dealing with sparse training data. The module replaces Spanish words with associated tags. The list of Spanish words (the vocabulary) and their associated tags is computed automatically by considering, for every Spanish word, the signs with the highest probability of being its translation. This automatic tag extraction has been compared with a manual strategy and achieves almost the same improvement. The analysis also examines several alternatives for dealing with non-relevant words, that is, Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). The system has been developed for a specific application domain: the renewal of identity documents and driver's licenses. To evaluate the system, a parallel corpus of 4,080 Spanish sentences and their LSE translations was used. The evaluation results revealed a significant performance improvement when this preprocessing module was included. In the phrase-based system, the module raised BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and the human evaluation score from 0.64 to 0.83. For the SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.
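The word-to-tag replacement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the probability table `LEXICON`, the `min_prob` threshold, the `<NR>` filler tag, and the three non-relevant-word strategies (`drop`, `keep`, `tag`) are all assumptions introduced for illustration, and the probability values are invented.

```python
from typing import Dict, List, Optional

# Hypothetical word-to-sign translation probabilities p(sign | word).
# In a real system these would be estimated from a word-aligned parallel
# corpus; the numbers here are invented for illustration only.
LEXICON: Dict[str, Dict[str, float]] = {
    "renovar": {"RENOVAR": 0.92, "NUEVO": 0.05},
    "carné": {"CARNET": 0.88, "DNI": 0.07},
    "conducir": {"CONDUCIR": 0.95},
    "el": {"CARNET": 0.10, "ESTE": 0.08},
    "de": {"CARNET": 0.06},
}


def build_tag_table(
    lexicon: Dict[str, Dict[str, float]], min_prob: float = 0.5
) -> Dict[str, Optional[str]]:
    """Map each word to its most probable sign, or None if non-relevant.

    The min_prob cutoff is an assumed criterion for deciding that a word
    is not assigned to any sign.
    """
    table: Dict[str, Optional[str]] = {}
    for word, candidates in lexicon.items():
        best_sign, best_prob = max(candidates.items(), key=lambda kv: kv[1])
        table[word] = best_sign if best_prob >= min_prob else None
    return table


def categorize(
    sentence: str,
    table: Dict[str, Optional[str]],
    non_relevant: str = "drop",
    filler: str = "<NR>",
) -> List[str]:
    """Replace words with tags before translation.

    non_relevant selects one of three assumed strategies:
    "drop" removes the word, "keep" leaves it unchanged,
    "tag" replaces it with a generic filler tag.
    """
    if non_relevant not in ("drop", "keep", "tag"):
        raise ValueError(f"unknown strategy: {non_relevant}")
    output: List[str] = []
    for word in sentence.lower().split():
        if word not in table:
            output.append(word)  # out-of-vocabulary word: passed through
        elif table[word] is not None:
            output.append(table[word])  # relevant word: replaced by its tag
        elif non_relevant == "keep":
            output.append(word)
        elif non_relevant == "tag":
            output.append(filler)
        # "drop": the non-relevant word is omitted
    return output


if __name__ == "__main__":
    tags = build_tag_table(LEXICON)
    print(categorize("Renovar el carné de conducir", tags))
```

The categorized sentence, rather than the raw Spanish one, would then be passed to the phrase-based or SFST translator. Mapping several words onto one tag shrinks the source vocabulary, which is one plausible reason such a step helps when training data is sparse.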
Similar resources
Source Language Categorization for improving a Speech into Sign Language Translation System
This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. This categorization module replaces Spanish words with associated tags. When implementing this module, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not relevant in the translation process. ...
Improving a Catalan-Spanish Statistical Translation System using Morphosyntactic Knowledge
In this paper, a human evaluation of a Catalan-Spanish Ngram-based statistical machine translation system is used to develop specific techniques based on the use of grammatical categories, lexical categorisation and text processing, for the enhancement of the final translation. The system is successfully improved when testing with ad hoc and general corpora, as it is shown in the final automati...
Automatic Translation System to Spanish Sign Language with a Virtual Interpreter
In this paper, an automatic translation system from Spanish language into Spanish Sign Language (LSE) performed by a virtual interpreter is presented. The translator is based on rules from Spanish grammar considering the syntactical and morphological characteristics of words and the semantics of their meaning. The system has been incorporated to an animation engine in which a virtual character ...
An On-Line, Cloud-Based Spanish-Spanish Sign Language Translation System
An on-line Spanish-Spanish Sign Language (LSE) translation system is presented in which Spanish speech content is translated into LSE to provide Spanish deaf people access to speech information. It is cloud-based, built over a speech recognition module, a transfer-based machine translation module and a Sign Language synthesis module that employs an avatar to present the signed content.
Journal: Computer Speech & Language
Volume 26
Publication date: 2012